    A matter of words: NLP for quality evaluation of Wikipedia medical articles

    Automatic quality evaluation of Web information is a task with many fields of application and of great relevance, especially in critical domains like the medical one. We start from the intuition that the quality of the content of medical Web documents is affected by features related to the specific domain: first, the usage of a specific vocabulary (Domain Informativeness); then, the adoption of specific codes (like those used in the infoboxes of Wikipedia articles) and the type of document (e.g., historical and technical ones). In this paper, we propose to leverage specific domain features to improve the results of the evaluation of Wikipedia medical articles. In particular, we evaluate the articles adopting an "actionable" model, whose features are related to the content of the articles, so that the model can also directly suggest strategies for improving the quality of a given article. We rely on Natural Language Processing (NLP) and dictionary-based techniques to extract the biomedical concepts in a text. We prove the effectiveness of our approach by classifying the medical articles of the Wikipedia Medicine Portal, which were previously manually labeled by the WikiProject team. The results of our experiments confirm that, by considering domain-oriented features, it is possible to obtain appreciable improvements over existing solutions, mainly for those articles that other approaches classify less accurately. Besides being interesting in their own right, the results call for further research on domain-specific features suitable for Web data quality assessment.
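    The abstract leaves the Domain Informativeness feature at the level of intuition; below is a minimal sketch of one plausible reading, the fraction of tokens that appear in a medical dictionary. The vocabulary file, function names, and example sentence are illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal sketch of a domain-informativeness feature, assuming the feature
# can be approximated as the share of tokens found in a medical vocabulary.
# The dictionary file and function names are hypothetical, not the paper's
# actual NLP and dictionary-based concept extraction.
import re

def load_vocabulary(path):
    """Load a set of lowercase domain terms, one per line."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def domain_informativeness(text, vocabulary):
    """Return the fraction of word tokens that belong to the vocabulary."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in vocabulary) / len(tokens)

vocab = load_vocabulary("medical_terms.txt")  # hypothetical dictionary file
score = domain_informativeness("Aspirin inhibits platelet aggregation.", vocab)
print(f"domain informativeness: {score:.2f}")
```

    In the paper's setting, such a score would be one feature among several (infobox codes, document type) feeding the actionable classification model.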

    Social Software and Semantics for Business Process Management - Alternative or Synergy?

    Business Process Management (BPM) provides support for managing organizations' processes and facilitates their adaptation to changing market conditions. Although various BPM solutions have been successfully applied in industry, there are still many open issues to be addressed, e.g., ensuring the commitment of employees in process modelling and reengineering, or enabling automation of the business process lifecycle. Researchers are currently investigating the use of Semantic Web and Social Software technologies to overcome the existing problems. Based on the conducted study, we argue that although semantics and Social Software technologies address different problems, they can be combined: utilized together, they enable organizations to advance their processes and adapt faster to changing market conditions.

    Graph Kernels for Task 1 and 2 of the Linked Data Data-Mining Challenge 2013

    In this paper we present the application of two RDF graph kernels to tasks 1 and 2 of the Linked Data Data-Mining Challenge. Both graph kernels use term vectors to handle RDF literals. Based on experiments with the task data, we use the Weisfeiler-Lehman RDF graph kernel for task 1 and the intersection path tree kernel for task 2 in our final classifiers for the challenge. Applying these graph kernels is very straightforward and requires (almost) no preprocessing of the data.
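    The abstract gives no implementation details; the sketch below shows the general shape of kernel-based graph classification with a Weisfeiler-Lehman subtree kernel, using the generic GraKeL implementation rather than the RDF-specific kernels and term-vector literal handling used in the challenge. The toy graphs and labels are assumptions for illustration.

```python
# Minimal sketch of graph classification with a Weisfeiler-Lehman subtree
# kernel via GraKeL; this is the generic WL kernel, not the RDF-specific
# variant from the paper, and the graphs below are toy data.
from grakel import Graph
from grakel.kernels import WeisfeilerLehman, VertexHistogram
from sklearn.svm import SVC

# Toy labeled graphs: edge lists plus node-label dictionaries.
g1 = Graph([(0, 1), (1, 2)], node_labels={0: "A", 1: "B", 2: "A"})
g2 = Graph([(0, 1), (1, 2), (2, 0)], node_labels={0: "A", 1: "A", 2: "B"})
g3 = Graph([(0, 1)], node_labels={0: "B", 1: "B"})
graphs, y = [g1, g2, g3], [0, 1, 0]

# Compute the WL kernel matrix and train an SVM on the precomputed kernel.
wl = WeisfeilerLehman(n_iter=3, base_graph_kernel=VertexHistogram, normalize=True)
K = wl.fit_transform(graphs)
clf = SVC(kernel="precomputed").fit(K, y)
print(clf.predict(K))
```

    In the RDF setting, each instance node would first be mapped to a labeled (sub)graph like these, which is roughly where the "almost no preprocessing" claim comes in.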

    A Fast and Simple Graph Kernel for RDF

    In this paper we study a graph kernel for RDF based on constructing a tree for each instance and counting the number of paths in that tree. In our experiments this kernel shows classification performance comparable to the previously introduced intersection subtree kernel, but is significantly faster in terms of computation time. Prediction performance is worse than that of the state-of-the-art Weisfeiler-Lehman RDF kernel, but our kernel is a factor of 10 faster to compute. Thus, we consider this kernel a very suitable baseline for learning from RDF data. Furthermore, we extend this kernel to handle RDF literals as bag-of-words feature vectors, which increases performance in two of the four experiments.
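    As a rough illustration of the path-counting idea described above, the sketch below collects outgoing label paths up to a fixed depth from an instance node of an rdflib graph and takes the dot product of the resulting count vectors. The data file, instance URIs, and depth are assumptions; the paper's exact tree construction and its bag-of-words extension for literals are not reproduced.

```python
# Minimal sketch of a path-counting kernel over RDF using rdflib: count
# outgoing (predicate, object) label paths up to a fixed depth and take a
# dot product. The file name and instance URIs are hypothetical.
from collections import Counter
from rdflib import Graph, URIRef

def path_counts(g, root, depth):
    """Count all outgoing label paths of length <= depth starting at root."""
    counts = Counter()
    frontier = [(root, ())]
    for _ in range(depth):
        nxt = []
        for node, path in frontier:
            for p, o in g.predicate_objects(node):
                new_path = path + (str(p), str(o))
                counts[new_path] += 1
                nxt.append((o, new_path))
        frontier = nxt
    return counts

def path_kernel(c1, c2):
    """Dot product of two sparse path-count feature vectors."""
    return sum(v * c2[k] for k, v in c1.items())

g = Graph()
g.parse("data.ttl", format="turtle")  # hypothetical RDF data file
k = path_kernel(path_counts(g, URIRef("http://example.org/i1"), 2),
                path_counts(g, URIRef("http://example.org/i2"), 2))
print("kernel value:", k)
```

    Counting label paths keeps the per-instance work close to linear in the number of extracted paths, which hints at why this style of kernel can be faster than subtree-matching kernels.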